MAP Lexicon is Useful for Segmentation and Word Discovery in Child Directed Speech

Author

Anand Raman

Abstract

An efficient algorithm for segmenting child-directed speech into words has recently been proposed in the Machine Learning journal. This short technical note proposes some modifications to this algorithm. In particular, a slightly more conservative variation of the original approach is proposed that infers word boundaries based simply on the maximum a posteriori (MAP) lexicon. Results of empirical tests illustrating the value of the proposed modifications are also presented.

Similar resources

Relating Unsupervised Word Segmentation to Reported Vocabulary Acquisition

A range of computational approaches have been used to model the discovery of word forms from continuous speech by infants. Typically, these algorithms are evaluated with respect to the ideal ‘gold standard’ word segmentation and lexicon. These metrics assess how well an algorithm matches the adult state, but may not reflect the intermediate states of the child’s lexical development. We set up a...

A statistical model for word discovery in child directed speech

A statistical model for segmentation and word discovery in child directed speech is presented. An incremental unsupervised learning algorithm to infer word boundaries based on this model is described and results of empirical tests showing that the algorithm is competitive with other models that have been used for similar tasks are also presented.

Statistical Speech Segmentation and Word Learning in Parallel: Scaffolding from Child-Directed Speech

In order to acquire their native languages, children must learn richly structured systems with regularities at multiple levels. While structure at different levels could be learned serially, e.g., speech segmentation coming before word-object mapping, redundancies across levels make parallel learning more efficient. For instance, a series of syllables is likely to be a word not only because of ...

Word segmentation in Persian continuous speech using F0 contour

Word segmentation in continuous speech is a complex cognitive process. Previous research on spoken word segmentation has revealed that in fixed-stress languages, listeners use acoustic cues to stress to de-segment speech into words. It has been further assumed that stress in non-final or non-initial position hinders the demarcative function of this prosodic factor. In Persian, stress is retract...

Pii: S0364-0213(01)00061-1

This paper presents an implemented computational model of word acquisition which learns directly from raw multimodal sensory input. Set in an information theoretic framework, the model acquires a lexicon by finding and statistically modeling consistent cross-modal structure. The model has been implemented in a system using novel speech processing, computer vision, and machine learning algorithm...

Publication date: 1999
